1. Introduction and Summary


The purpose of the present paper is to illustrate the concept of an
extreme family as defined by Lauritzen (1974) and to define a class of
statistical models for discrete observations generalizing classical
exponential families.

In the classical formulation, a discrete exponential family is a
family of probability measures $(P_\theta,\ \theta \in \Theta)$, where the parameter space $\Theta$ is
a subset of $k$-dimensional Euclidean vector space, the probability function
being given by

\[ P_\theta(x) = a(\theta)\, b(x)\, \exp\Big( \sum_{i=1}^{k} \theta_i t_i(x) \Big), \qquad (1.1) \]

where $x \in E$, a discrete set, $t_i$ are real valued functions and $\theta = (\theta_1, \ldots, \theta_k)$.

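If one observes independent identically distributed random variables $X_1, \ldots, X_n$
with the common probability for $X_i$ given by (1.1), the joint probability will
be given by

\[ P_\theta^{(n)}(x_1, \ldots, x_n) = a(\theta)^n \Big( \prod_{j=1}^{n} b(x_j) \Big) \exp\Big( \sum_{i=1}^{k} \theta_i \sum_{j=1}^{n} t_i(x_j) \Big). \qquad (1.2) \]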


Now, $(P_\theta^{(n)},\ \theta \in \Theta)$ is again an exponential family with the same parameter space as
before. This is not a coincidence, and a closer look at the elements of the
exponential family explains why.

The function $b$ is a common reference measure defining the support of
the measures $(P_\theta,\ \theta \in \Theta)$, and $a(\theta)$ is a normalizing constant. The functions
$(t_1, \ldots, t_k)$ are the sufficient statistics, and as an experiment is repeated,
the sufficient statistic for the combined experiment is obtained as the sum
of the sufficient statistics for the experiments in the repetition:

\[ t_i^{(n)}(x_1, \ldots, x_n) = t_i(x_1) + \cdots + t_i(x_n). \qquad (1.3) \]


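The reason for this is that the function

\[ \xi_\theta : (t_1, \ldots, t_k) \mapsto \exp\Big( \sum_{i=1}^{k} \theta_i t_i \Big) \]

is a homomorphism of the range space of $(t_1, \ldots, t_k)$ into the group $((0,\infty), \cdot)$:

\[ \xi_\theta\big( (t_1, \ldots, t_k) + (s_1, \ldots, s_k) \big) = \xi_\theta\big( (t_1, \ldots, t_k) \big) \cdot \xi_\theta\big( (s_1, \ldots, s_k) \big). \qquad (1.4) \]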
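The homomorphism property (1.4) is easy to check numerically. The following is a minimal Python sketch; the function name `xi` and the particular values of the parameter and of the two statistics are illustrative assumptions, not taken from the paper.

```python
import math

def xi(theta, t):
    """Evaluate exp(sum_i theta_i * t_i) for a k-dimensional statistic t."""
    return math.exp(sum(th * ti for th, ti in zip(theta, t)))

theta = (0.3, -1.2)   # hypothetical parameter value, k = 2
t = (2.0, 1.0)        # statistic from one experiment (illustrative)
s = (0.5, 4.0)        # statistic from a second experiment (illustrative)

# Componentwise sum of the two statistics, as in (1.3).
t_plus_s = tuple(a + b for a, b in zip(t, s))

lhs = xi(theta, t_plus_s)
rhs = xi(theta, t) * xi(theta, s)
print(math.isclose(lhs, rhs))
```

Only addition of statistics and multiplication of function values are used, which is the observation the paper builds on next.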

The  idea  in  this  paper  is  that  most  results  about  exponential  families 
essentially  are  based  on  the  above  properties  only.  We  shall  therefore  try  to 
define  a  class  of  families  of  distributions  via  these  properties. 

If we again look at (1.1), (1.2) and (1.3) we see that we never subtract.
In  fact,  we  only  use  that  the  algebraic  operation  +  is  associative  and 
commutative.  A  set  with  a  composition  which  is  associative,  commutative  and 
has  a  unit  is  called  a  commutative  monoid,  cf.  Bourbaki  (1970).  Commutative 
monoids  are  so  simple  that  they  are  not  studied  very  much.  Therefore,  we 
have  to  establish  some  of  the  simple  results  about  these  ourselves,  which  is 
done  in  Section  2. 

Let us consider another family of distributions $(P_\theta,\ \theta \in \Theta)$, where
$\Theta = \{1, 2, \ldots\}$ and

\[ P_\theta(x) = \frac{1}{\theta}\, \chi_{\{1, \ldots, \theta\}}(x), \]

where $x \in E = \{1, 2, \ldots\}$ and $\chi_A$ is the indicator function of the set $A$,
i.e.

\[ \chi_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{otherwise.} \end{cases} \qquad (1.5) \]


Consider $X_1, \ldots, X_n$ independent and identically distributed as above. Their
joint probability is given by

\[ P_\theta^{(n)}(x_1, \ldots, x_n) = \frac{1}{\theta^n}\, \chi_{\{1, \ldots, \theta\}}(\max\{x_1, \ldots, x_n\}). \qquad (1.6) \]


The same situation as before is actually present if we replace $t(x_1) + t(x_2)$
by $\max\{t(x_1), t(x_2)\}$. The only difference is that $\xi_\theta(x) = \chi_{\{1, \ldots, \theta\}}(x)$ can
turn out to be zero this time, whereas exponentials are always strictly positive. The
support of the measures $(P_\theta,\ \theta \in \Theta)$ in this last example varies with $\theta \in \Theta$,
which is not the case in the first example.
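A short Python sketch may make (1.6) concrete: the joint probability of a sample from the discrete uniform family depends on the sample only through its maximum and its size. The function names and the sample values are illustrative assumptions.

```python
from math import isclose, prod

def p(theta, x):
    """Single-observation probability: 1/theta on {1, ..., theta}, else 0."""
    return 1.0 / theta if 1 <= x <= theta else 0.0

def joint_direct(theta, xs):
    """Product of the individual probabilities."""
    return prod(p(theta, x) for x in xs)

def joint_via_max(theta, xs):
    """Right-hand side of (1.6); assumes every observation is >= 1."""
    n, m = len(xs), max(xs)
    return theta ** (-n) if m <= theta else 0.0

sample = [3, 1, 4, 1, 5]   # illustrative observations
checks = [isclose(joint_direct(th, sample), joint_via_max(th, sample))
          for th in range(1, 9)]
print(all(checks))
```

For every parameter value below the sample maximum both expressions vanish, which is the varying-support effect described above.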

In section 7, part I of Barndorff-Nielsen (1973), there is a detailed
discussion of problems connected to maximum likelihood estimation in exponential
families. The maximum likelihood estimator in regular canonical
exponential families is shown to exist iff the observed value of the
sufficient statistic falls within the interior of the
convex hull of the support of the measures in the family, transformed by the
sufficient statistics. This means that if the boundary of this convex hull
has positive probability, one might very well get an observation from which
it is impossible to estimate. To solve this problem it is proposed there
to make a suitable extension of the model, the extension being defined for
families where the set of possible values of the sufficient statistics
is assumed to be finite. The extension is called the completion of an
exponential family.

The measures in the completion of an exponential family certainly have
their support varying with the parameter, and the "fixed support" property
therefore does not seem to be essential to the nice results existing for
exponential families.

The families defined in the present paper are shown to be "complete" in
the sense that the maximum likelihood estimator of the parameters always
exists.

In  section  3,  the  families  are  defined  and  some  examples  are  discussed. 

In section 4 we show the existence and uniqueness of the maximum likelihood
estimate of the unknown parameter in such families.

In section 5 we show that the family of Markov chains made up by sequences
of sufficient statistics from successive independent repetitions of an experiment
giving rise to a general exponential model is in fact an extreme family
of Markov chains as defined by Lauritzen (1974).

In section 6 we shall briefly discuss the relation between the models
defined in the present paper and the completion of an exponential family as
defined by Barndorff-Nielsen (1973).

2. Commutative Monoids

The algebraic structure of commutative monoids will play an essential role
in the present paper. We shall quote the definition, cf. Bourbaki (1970).

Definition 2.1 Let $M$ be a set and $*$ a composition rule on $M$. $(M,*)$
is said to be a commutative monoid if $*$ is associative, commutative and
has a unit, i.e., if

i) $\forall a,b,c \in M$: $a * (b*c) = (a*b) * c$,

ii) $\forall a,b \in M$: $a * b = b * a$,

iii) $\exists e \in M$: $\forall a \in M$: $e * a = a * e = a$.

As  we  shall  only  consider  commutative  monoids  throughout  this  paper,  we  shall 
just  write  "monoid"  instead  of  "commutative  monoid".  Examples  of  commutative 
monoids  are 


1) $(N,+)$, where $N = \{0,1,2,\ldots\}$. Here 0 is the unit.

2) $(N,\vee)$, where $x \vee y = \max\{x,y\}$. The unit is 0.

3) $(N \cup \{\infty\}, \wedge)$, where $x \wedge y = \min\{x,y\}$. The unit is $\infty$.

4) $(R_+, \cdot)$, where $R_+$ is the set of nonnegative real numbers. The unit is 1.
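The axioms of Definition 2.1 can be spot-checked for these four examples. The sketch below is a brute-force test on small finite samples of each carrier set (an assumption of the sketch, since the sets themselves are infinite); exact `Fraction` arithmetic is used for $(R_+, \cdot)$ so that associativity is not disturbed by floating-point rounding. It checks the three axioms only, not closure.

```python
import math
from fractions import Fraction
from itertools import product
from operator import add, mul

def satisfies_axioms(sample, op, unit):
    """Check unit, commutativity and associativity on a finite sample."""
    sample = list(sample)
    unit_ok = all(op(unit, a) == a and op(a, unit) == a for a in sample)
    comm_ok = all(op(a, b) == op(b, a) for a, b in product(sample, repeat=2))
    assoc_ok = all(op(a, op(b, c)) == op(op(a, b), c)
                   for a, b, c in product(sample, repeat=3))
    return unit_ok and comm_ok and assoc_ok

examples = {
    "(N, +)": (range(6), add, 0),
    "(N, max)": (range(6), max, 0),
    "(N u {inf}, min)": (list(range(6)) + [math.inf], min, math.inf),
    "(R+, *)": ([Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3)],
                mul, Fraction(1)),
}
results = {name: satisfies_axioms(*spec) for name, spec in examples.items()}
print(results)
```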


Now, let $(M,*)$ be a monoid. Consider the set $\hat M$ consisting of all
homomorphisms $\xi: (M,*) \to (R_+,\cdot)$, i.e., satisfying for all $a,b \in M$

\[ \xi(a)\,\xi(b) = \xi(a*b), \qquad \xi(e) = 1, \qquad (2.1) \]

where $e$ is the unit in $(M,*)$. If $\xi_1, \xi_2 \in \hat M$, the mapping $\xi_1 \cdot \xi_2$ defined
by

\[ (\xi_1 \cdot \xi_2)(a) = \xi_1(a)\,\xi_2(a) \qquad (2.2) \]

is obviously in $\hat M$ and it is a trivial exercise to verify that $(\hat M, \cdot)$ is a
monoid with the unit being $\xi_e$, defined as

\[ \xi_e(a) = 1 \quad \text{for all } a \in M. \qquad (2.3) \]

$(\hat M, \cdot)$ shall be called the dual monoid to $(M,*)$.

If we have two monoids $(M,*)$ and $(N,\circ)$ we can form the product
of these

\[ (M,*) \times (N,\circ) = (M \times N, \otimes), \qquad (2.4) \]

where

\[ (m_1,n_1) \otimes (m_2,n_2) = (m_1 * m_2,\ n_1 \circ n_2). \qquad (2.5) \]

This is again a monoid, the unit being $(e_M, e_N)$, where $e_M$ and $e_N$ are
the units in $M$ and $N$ respectively. The dual to a product can easily be
obtained from the duals to the elements in the product:

Proposition 2.1: The homomorphisms of $(M \times N, \otimes)$ into $(R_+,\cdot)$ are exactly
those of the form

\[ \xi(m,n) = \xi_M(m)\,\xi_N(n), \]

where $\xi_M \in \hat M$ and $\xi_N \in \hat N$.

Proof: The equation

\[ \xi(m_1 * m_2,\ n_1 \circ n_2) = \xi(m_1,n_1)\,\xi(m_2,n_2) \qquad (2.6) \]

gives for $m_1 = m$, $m_2 = e_M$, $n_2 = n$, $n_1 = e_N$

\[ \xi(m,n) = \xi(m,e_N) \cdot \xi(e_M,n). \qquad (2.7) \]

Now (2.6) for $n_1 = n_2 = e_N$ gives

\[ \xi(m_1 * m_2,\ e_N) = \xi(m_1,e_N)\,\xi(m_2,e_N), \qquad (2.8) \]

which means that $\xi(\cdot, e_N)$ must be in $\hat M$. Analogously one gets that $\xi(e_M, \cdot)$
must be in $\hat N$. So all homomorphisms must be of the form

\[ \xi(m,n) = \xi_M(m)\,\xi_N(n). \qquad (2.9) \]

It is easy to see that functions of this form are in the dual of $M \times N$. End of proof.

As  mentioned  in  the  introduction,  the  support  of  the  measures  in  the 
families  we  consider  may  very  often  vary  with  the  parameter.  This  will  of 
course  not  be  in  a  completely  arbitrary  fashion  but  in  a  fashion  compatible 
with  the  algebraic  structure  of  the  sufficient  statistics.  To  investigate 
this  aspect,  the  following  concept  will  be  of  relevance: 

Definition 2.2 $F \subseteq M$ is said to be a face of $M$ if

i) $F$ is a submonoid of $M$ and
ii) $c \in F \wedge c = a*b \Rightarrow a \in F \wedge b \in F$.

The faces of $M$ are exactly the possible positivity regions for elements in
$\hat M$:

Proposition 2.2: Let $F \subseteq M$. $F$ is a face of $M$ iff there is a $\xi \in \hat M$ so that

\[ F = \{a \in M: \xi(a) > 0\}. \]

Proof: If $F = \{a \in M: \xi(a) > 0\}$ for some $\xi \in \hat M$, then

\[ a \in F \wedge b \in F \Rightarrow \xi(a*b) = \xi(a)\,\xi(b) > 0, \qquad (2.10) \]

and as $\xi(e) = 1$, $F$ is a submonoid of $M$. If $c \in F$ and $c = a*b$, then

\[ 0 < \xi(a*b) = \xi(a)\,\xi(b) \qquad (2.11) \]

and hence $\xi(a)$ and $\xi(b)$ both must be positive, i.e., $a \in F$ and $b \in F$.
If on the other hand $F$ is a face of $M$, we can define

\[ \xi(a) = \begin{cases} 1 & \text{if } a \in F \\ 0 & \text{otherwise.} \end{cases} \]

$\xi$ is easily seen to be a homomorphism and the result is proved.

Proposition 2.3: $M$ is a face of $M$.
The proof is obvious.

We also have

Proposition 2.4: If $(F_i)_{i \in I}$ is a family of faces of $M$, then

\[ F = \bigcap_{i \in I} F_i \qquad (2.12) \]

is a face of $M$.

Proof: Immediate from the definition.

Remark: From propositions 2.3 and 2.4 it follows that for any $a \in M$ there is
a unique smallest face of $M$, $F(a)$, so that $a \in F(a)$.

Propositions 2.1 and 2.2 enable us to establish a result about faces of
product monoids:

Proposition 2.5: $F$ is a face of $M \times N$ iff $F = F_M \times F_N$, where $F_M$ and
$F_N$ are faces of $M$ and $N$ respectively.

Proof: According to proposition 2.1, all homomorphisms $\xi$ of $(M \times N, \otimes)$
into $(R_+,\cdot)$ are of the form

\[ \xi(m,n) = \xi_M(m)\,\xi_N(n), \qquad (2.13) \]

where $\xi_M \in \hat M$ and $\xi_N \in \hat N$. Now

\[ \{(m,n): \xi(m,n) > 0\} = \{(m,n): \xi_M(m) > 0 \wedge \xi_N(n) > 0\} = \{m: \xi_M(m) > 0\} \times \{n: \xi_N(n) > 0\}. \qquad (2.14) \]

Proposition 2.2 and equation (2.14) together yield the result.

Example 2.1: Let us consider the monoid $(N,+)$. A homomorphism $\xi$ must satisfy
$\xi(0) = 1$. Let $\xi(1) = \theta$, some non-negative real number. We must have

\[ \xi(n) = \xi(1)^n = \theta^n. \qquad (2.15) \]

It follows that the only faces of $(N,+)$ are $\{0\}$ and $N$.

Example 2.2: Let us consider $(N,\vee)$. Let $n \in N$ be fixed. The smallest face
containing $n$ must contain all integers less than or equal to $n$ as

\[ n \vee x = n \quad \text{if } x \le n. \qquad (2.16) \]

On the other hand, $\{0,\ldots,n\}$ is obviously a face of $(N,\vee)$. Hence all faces
of $(N,\vee)$ are $N$ itself and subsets of the form $\{0,\ldots,n\}$ for some $n \in N$.

Now let $\xi \in \hat N$ be positive exactly on $\{0,\ldots,n\}$; it must satisfy

\[ \xi(x \vee n) = \xi(x)\,\xi(n) = \xi(n) \quad \text{for } x \le n. \qquad (2.17) \]

As $\xi(n)$ is strictly positive, we get

\[ \xi(x) = 1 \quad \text{for } x \le n \qquad (2.18) \]

and hence that the only homomorphisms of $(N,\vee)$ are indicator functions of
faces,

\[ \xi(x) = \chi_{\{0,\ldots,n\}}(x) \qquad (2.19) \]

for some $n \in N \cup \{\infty\}$.
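The description of the faces of $(N,\vee)$ can be confirmed by brute force on a finite stand-in. The sketch below assumes the truncated monoid $(\{0,\ldots,K\}, \max)$ with $K = 4$ (a choice made for this illustration only), enumerates all non-empty subsets, and keeps those satisfying Definition 2.2.

```python
from itertools import combinations

def is_face(F, M, op, unit):
    """Definition 2.2: F is a submonoid, and c in F with c = a*b forces a, b in F."""
    F = set(F)
    if unit not in F:
        return False
    if any(op(a, b) not in F for a in F for b in F):
        return False
    return all(a in F and b in F
               for a in M for b in M if op(a, b) in F)

def faces(M, op, unit):
    """Enumerate all faces of a finite monoid by exhaustive search."""
    M = list(M)
    return [set(c)
            for r in range(1, len(M) + 1)
            for c in combinations(M, r)
            if is_face(c, M, op, unit)]

K = 4                                    # truncation level (assumption)
found = faces(range(K + 1), max, 0)
expected = [set(range(n + 1)) for n in range(K + 1)]
print(found == expected)
```

In the truncated monoid the full set $\{0,\ldots,K\}$ plays the role that $N$ itself plays in the example.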

Example 2.3: If we now form the product $(N,\vee) \times (N,+)$ it follows from
proposition 2.1 that all homomorphisms are of the form

\[ \xi(x,y) = \chi_{\{0,\ldots,n\}}(x)\,\theta^y \qquad (2.20) \]

for some $n \in N \cup \{\infty\}$ and $\theta \ge 0$. From proposition 2.5, we get that the faces
of this monoid are

$\{0,\ldots,n\} \times N$, $n \in N$,
$\{0,\ldots,n\} \times \{0\}$, $n \in N$,
$N \times N$.
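As an illustration of (2.20), the following Python sketch checks the homomorphism equation (2.1) on a finite grid of the product monoid and reads off the positivity regions. The grid size and the values $n = 3$, $\theta = 2/5$ and $\theta = 0$ are assumptions chosen for the example; `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction
from itertools import product

def make_xi(n, theta):
    """xi(x, y) = indicator(x <= n) * theta**y, as in (2.20)."""
    def xi(x, y):
        return (1 if x <= n else 0) * theta ** y
    return xi

def op(p, q):
    """Composition in (N, max) x (N, +)."""
    return (max(p[0], q[0]), p[1] + q[1])

grid = list(product(range(6), range(4)))   # finite window of N x N

xi = make_xi(3, Fraction(2, 5))
hom_ok = all(xi(*op(p, q)) == xi(*p) * xi(*q) for p in grid for q in grid)
unit_ok = xi(0, 0) == 1

# Positivity regions restricted to the grid.
support_pos = {p for p in grid if xi(*p) > 0}
xi0 = make_xi(3, Fraction(0))
support_zero = {p for p in grid if xi0(*p) > 0}
print(hom_ok, unit_ok)
```

With $\theta > 0$ the positivity region is the trace of $\{0,\ldots,3\} \times N$ on the grid; with $\theta = 0$ it shrinks to $\{0,\ldots,3\} \times \{0\}$, matching the first two kinds of faces listed above.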

3.  General  Exponential  Models 

In the following we shall consider an at most denumerable set $E$, a
monoid $(M,*)$ and a mapping $t: E \to M$. We shall think of $E$ as the sample
space and of $t$ as a sufficient statistic. Let $M_1 = t(E)$ and define
recursively

\[ M_n = M_1 * M_{n-1} \quad \text{for } n = 2,3,\ldots \qquad (3.1) \]

This is done for the following reason: if we make $n$ independent observations
of a random variable on $E$, we shall assume that the sufficient statistic
will be

\[ t^{(n)}(x_1,\ldots,x_n) = t(x_1) * \cdots * t(x_n), \qquad (3.2) \]

and hence $M_n = t^{(n)}(E^n)$.

We shall assume that we can infer the size of the experiment from the
statistic, or, in other words, that

\[ M_m \cap M_n = \emptyset \quad \text{whenever } n \ne m. \qquad (3.3) \]

For convenience we do not want $M$ to be bigger than necessary, hence
we assume that

\[ M = \bigcup_{n=0}^{\infty} M_n, \qquad (3.4) \]

where $M_0$ satisfies

\[ M_0 * M_n = M_n \quad \text{for all } n \in N. \qquad (3.5) \]

Let $\nu$ be a $\sigma$-finite measure on $E$ so that $\nu(x)$ is positive for all
$x \in E$. Let $\hat M_\nu$ denote the normalized dual to $M$:

\[ \hat M_\nu = \Big\{ \xi \in \hat M : \sum_{x \in E} \nu(x)\,\xi(t(x)) = 1 \Big\}, \qquad (3.6) \]

and assume that $\hat M_\nu$ is non-empty.

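A statistical model for a random variable $X$ taking values in $E$ is a
family $\mathcal P$ of probability measures on $E$.

Definition 3.1: $\mathcal P$ is said to be a general exponential model if there
exist $M$, $t$ and $\nu$ as above, so that

\[ \mathcal P = \{ P_\xi,\ \xi \in \hat M_\nu \}, \qquad P_\xi\{X = x\} = \nu(x)\,\xi(t(x)). \]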

Let us first see in what sense this looks like a "classical" exponential
model. Suppose we have observed $n$ independent random variables from the
above distribution. The joint probability function is

\[ P_\xi^{(n)}(x_1,\ldots,x_n) = \Big( \prod_{i=1}^{n} \nu(x_i) \Big) \cdot \xi(t(x_1)) \cdots \xi(t(x_n)) = \Big( \prod_{i=1}^{n} \nu(x_i) \Big)\, \xi\big( t(x_1) * \cdots * t(x_n) \big) \qquad (3.7) \]

as $\xi \in \hat M$. If we compare (3.7) with (1.2) in the introduction, we note that
$\nu$ plays the same role as $b$, the common reference measure. The statistic
$t$ corresponds to $(t_1,\ldots,t_k,n)$, i.e. the sufficient statistic plus a
"counting variable" indicating the size of the experiment. $\xi$ corresponds
to the function

\[ (t_1,\ldots,t_k,n) \mapsto a(\theta)^n \exp\Big( \sum_{i=1}^{k} \theta_i t_i \Big), \qquad (3.8) \]

so the normalizing constant $a(\theta)$ is taken into $\xi$ and the experiment size
into the statistic $t$.

The above defined models differ from the exponential models in several
respects. First, the range space of the statistic is a monoid instead of a
subset of a vector space, the parameter space is the normalized dual of this
monoid instead of a subset of a vector space, and there is no assumption of
anything like finite dimension. Furthermore we shall see that in general
the support of the measures in the family will depend on the parameter $\xi$,
as these will not always be positive. As derived in the previous section,
the possible positivity regions for $\xi$ will be the faces of the monoid $(M,*)$.
From (3.7) and the Neyman factorization theorem it immediately follows that

\[ t^{(n)}(X_1,\ldots,X_n) = t(X_1) * \cdots * t(X_n) \qquad (3.9) \]

is sufficient for the parameter $\xi$ from observation of $X_1,\ldots,X_n$.

The relation between these models and the classical exponential models
should be more apparent from the examples below.

Example 3.1 (The Bernoulli Distribution)

Let $E = \{0,1\}$ and $\nu(0) = \nu(1) = 1$. Let

\[ (M,*) = (N,+) \times (N,+). \qquad (3.10) \]

We have

\[ M = \bigcup_{n=0}^{\infty} M_n, \quad \text{where } M_n = \{(x,y): x+y = n\}. \qquad (3.11) \]

Let $t(1) = (1,0)$ and $t(0) = (0,1)$. The elements of $\hat M$ are all of the
form

\[ \xi_{\theta,\eta}(x,y) = \theta^x \eta^y, \quad \theta \ge 0,\ \eta \ge 0. \qquad (3.12) \]

We immediately get that

\[ \xi_{\theta,\eta} \in \hat M_\nu \iff \theta^1\eta^0 + \theta^0\eta^1 = 1 \iff \eta = 1 - \theta. \qquad (3.13) \]

Hence, the model

\[ P_\theta\{X = x\} = \nu(x)\,\xi_{\theta,1-\theta}(t(x)) = \begin{cases} \theta & \text{if } x = 1 \\ 1-\theta & \text{if } x = 0, \end{cases} \qquad (3.14) \]

where $0 \le \theta \le 1$, is a general exponential model. The difference between
this model and the classical exponential family version of the Bernoulli
distribution is that $\theta = 0$ and $\theta = 1$ are included in the model.

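Example 3.2 (The Poisson distribution).

Let $E = N$ and $\nu(x) = 1/x!$. Let

\[ (M,*) = (N,+) \times (N,+). \qquad (3.15) \]

We have

\[ M = \bigcup_{n=0}^{\infty} M_n, \quad \text{where } M_n = \{(x,y): y = n\}. \qquad (3.16) \]

Let $t(x) = (x,1)$. We get

\[ \xi_{\theta,\eta} \in \hat M_\nu \iff \sum_{x=0}^{\infty} \frac{\theta^x \eta}{x!} = 1 \iff \eta = e^{-\theta}. \qquad (3.17) \]

Hence, the model

\[ P_\theta\{X = x\} = \nu(x)\,\xi_{\theta,e^{-\theta}}(t(x)) = \frac{\theta^x}{x!}\, e^{-\theta}, \qquad (3.18) \]

where $\theta \ge 0$, is a general exponential model. Again the inclusion of $\theta = 0$
is the only difference between this and the classical approach.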
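The normalization step (3.17) can be checked numerically. The sketch below is a minimal Python illustration; truncating the infinite sum at 60 terms and the value $\theta = 2.5$ are assumptions of the sketch.

```python
import math

def nu(x):
    """Reference measure of Example 3.2: nu(x) = 1/x!."""
    return 1.0 / math.factorial(x)

def make_xi(theta, eta):
    """Homomorphism xi(x, y) = theta**x * eta**y on (N,+) x (N,+)."""
    return lambda x, y: theta ** x * eta ** y

def t(x):
    """Sufficient statistic: the observation plus a counting variable."""
    return (x, 1)

theta = 2.5                                   # illustrative parameter
xi = make_xi(theta, math.exp(-theta))         # eta chosen as in (3.17)
total = sum(nu(x) * xi(*t(x)) for x in range(60))   # truncated sum
print(math.isclose(total, 1.0, rel_tol=1e-9))
```

With $\theta = 0$ the same construction gives the degenerate distribution concentrated at $x = 0$, since Python evaluates `0.0 ** 0` as `1.0`.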
So far, the examples considered have basically been exponential models in
the classical sense apart from adding some degenerate distributions. The
following examples show that the models can in fact be quite different from
the classical exponential models.

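Example 3.3 (The uniform distribution).

Let $E = \{1,2,\ldots\}$ and $\nu(x) = 1$ for all $x \in E$. Let

\[ (M,*) = (E,\vee) \times (N,+). \qquad (3.19) \]

We have

\[ M = \bigcup_{n=0}^{\infty} M_n, \quad \text{where } M_n = \{(x,y): y = n\}. \qquad (3.20) \]

Let $t(x) = (x,1)$. The elements of $\hat M$ are all of the form

\[ \xi_{\theta,\eta}(x,y) = \chi_{\{1,\ldots,\theta\}}(x)\,\eta^y, \quad \theta \in E,\ \eta \ge 0. \qquad (3.21) \]

We have

\[ \xi_{\theta,\eta} \in \hat M_\nu \iff \sum_{x=1}^{\infty} \chi_{\{1,\ldots,\theta\}}(x)\,\eta = 1 \iff \eta = \frac{1}{\theta}. \qquad (3.22) \]

Hence, the model

\[ P_\theta\{X = x\} = \nu(x)\,\xi_{\theta,1/\theta}(t(x)) = \frac{1}{\theta}\,\chi_{\{1,\ldots,\theta\}}(x), \qquad (3.23) \]

where $\theta = 1,2,\ldots$, is a general exponential model.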
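Combining examples 3.1 - 3.3 we take the following:

Example 3.4 (Doubly truncated geometric distribution with unknown truncation
points). Let $E = N$ and $\nu(x) = 1$ for all $x \in E$. As our monoid $(M,*)$ we
choose the submonoid of

\[ (M',*) = (N,\vee) \times (N \cup \{\infty\},\wedge) \times (N,+) \times (N,+), \qquad (3.24) \]

given by $M = \bigcup_{n=0}^{\infty} M_n$, where

\[ M_0 = \{(0,\infty,0,0)\}, \quad M_1 = \{(x,x,x,1): x \in E\} \qquad (3.25) \]

and $M_n$ is recursively defined as

\[ M_n = M_1 * M_{n-1} \quad \text{for } n = 2,3,\ldots \qquad (3.26) \]

Let $t(x) = (x,x,x,1)$. The elements of $\hat M$ are all of the form

\[ \xi_{\theta,\eta,\lambda,\mu}(x,y,z,n) = \chi_{\{0,\ldots,\theta\}}(x)\,\chi_{\{\eta,\ldots,\infty\}}(y)\,\lambda^z \mu^n, \qquad (3.27) \]

where $\theta \in N$, $\eta \in N \cup \{\infty\}$, $\lambda \ge 0$ and $\mu \ge 0$.

We get

\[ \xi_{\theta,\eta,\lambda,\mu} \in \hat M_\nu \iff \sum_{x=\eta}^{\theta} \lambda^x \mu = 1 \]
\[ \iff \eta \le \theta,\ \lambda = 1 \text{ and } \frac{1}{\mu} = \theta - \eta + 1, \quad \text{or} \quad \eta \le \theta,\ \lambda \ne 1 \text{ and } \frac{1}{\mu} = \frac{\lambda^{\theta+1} - \lambda^{\eta}}{\lambda - 1}. \qquad (3.28) \]

Hence the model

\[ P_{\theta,\eta,\lambda}\{X = x\} = \frac{\lambda^x}{\varphi(\lambda)}\,\chi_{\{\eta,\ldots,\theta\}}(x), \qquad (3.29) \]

where $\lambda \ge 0$, $\theta \ge \eta$, $\theta,\eta \in N$ and

\[ \varphi(\lambda) = \begin{cases} \theta - \eta + 1 & \text{if } \lambda = 1 \\[4pt] \dfrac{\lambda^{\theta+1} - \lambda^{\eta}}{\lambda - 1} & \text{if } \lambda \ne 1, \end{cases} \qquad (3.30) \]

is a general exponential model.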

Finally  we  shall  consider  an  example,  where  the  general  exponential  model 
is  different  from  a  classical  one  in  the  sense  of  infinite-dimensionality  of 
the  parameter  space. 


Example 3.5 (The completely free distribution). Let $E$ be any denumerable
set, and $\nu(x) = 1$ for all $x \in E$. Let $(M,*) = N^{(E)}$, consisting of all mappings
$f$ from $E$ to $N$ with finite support, i.e., where $\{x: f(x) \ne 0\}$ is finite.
The composition rule is pointwise addition

\[ (f*g)(x) = f(x) + g(x). \qquad (3.31) \]

We have the partitioning of $M = \bigcup_{n=0}^{\infty} M_n$, where

\[ M_n = \Big\{ f \in N^{(E)} : \sum_{x \in E} f(x) = n \Big\}. \qquad (3.32) \]

If we let $t(x) = \chi_{\{x\}}$, we can see that the sufficient reduction of a sample
of size $n$ becomes the "frequency table", i.e., $t^{(n)}(x_1,\ldots,x_n)$ is the
function in $N^{(E)}$ having the value $n_x$ in $x$ iff $x$ occurs exactly $n_x$ times
in the sample $(x_1,\ldots,x_n)$. $\hat M$ consists of the elements

\[ \xi_\theta(f) = \prod_{x \in E} \theta(x)^{f(x)}, \qquad (3.33) \]

where $\theta$ is any mapping from $E$ into the non-negative real numbers. We have

\[ \xi_\theta \in \hat M_\nu \iff \sum_{x \in E} \prod_{y \in E} \theta(y)^{\chi_{\{x\}}(y)} = 1 \iff \sum_{x \in E} \theta(x) = 1. \qquad (3.34) \]

Hence, the model

\[ P_\theta\{X = x\} = \nu(x)\,\xi_\theta(\chi_{\{x\}}) = \theta(x), \qquad (3.35) \]

where $\theta$ satisfies

\[ \theta(x) \ge 0 \text{ for all } x \in E \quad \text{and} \quad \sum_{x \in E} \theta(x) = 1, \qquad (3.36) \]

is a general exponential model. Other examples could be generated ad libitum.
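The "frequency table" reduction of Example 3.5 maps directly onto a multiset. The Python sketch below uses `collections.Counter` as a stand-in for an element of $N^{(E)}$; the three-point sample space and the probabilities are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from math import prod

def t_n(sample):
    """Sufficient reduction of a sample: its frequency table."""
    return Counter(sample)

def make_xi(theta):
    """Homomorphism (3.33): f -> product over x of theta(x) ** f(x)."""
    def xi(f):
        return prod(theta[x] ** count for x, count in f.items())
    return xi

# Hypothetical parameter: a probability vector on E = {"a", "b", "c"}.
theta = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
sample = ["a", "b", "a", "c", "a"]

direct = prod(theta[x] for x in sample)     # product of point probabilities
via_table = make_xi(theta)(t_n(sample))     # same value through the statistic
print(direct == via_table)

# Counter addition is the pointwise addition (3.31).
combined = t_n(sample[:2]) + t_n(sample[2:])
print(combined == t_n(sample))
```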


4. Estimation in general exponential models

We shall consider the following estimation problem:

Let $X_1,\ldots,X_n$ be independent and identically distributed on $E$ with

\[ P_\xi\{X = x\} = \nu(x)\,\xi(t(x)), \qquad (4.1) \]

where $\nu$ and $t$ are known and as in the previous section and $\xi \in \hat M_\nu$ is
unknown. Our sample space is $E^n$, the parameter space is $\hat M_\nu$ and the
likelihood function becomes

\[ L(x_1,\ldots,x_n;\ \xi) = \Big( \prod_{i=1}^{n} \nu(x_i) \Big)\, \xi\big( t(x_1) * \cdots * t(x_n) \big). \qquad (4.2) \]

As mentioned earlier, $t^{(n)}$ given by

\[ t^{(n)}(x_1,\ldots,x_n) = t(x_1) * \cdots * t(x_n) \qquad (4.3) \]

is sufficient for $\xi$ and $\hat\xi$ is clearly a maximum likelihood estimator of
$\xi$ iff

\[ \hat\xi(t_0) = \sup_{\xi \in \hat M_\nu} \xi(t_0), \qquad (4.4) \]

where $t_0 = t(x_1) * \cdots * t(x_n)$.

In the following we shall establish the existence and uniqueness of $\hat\xi$
for any $n$ and $x_1,\ldots,x_n$.

First we prove a lemma:

Lemma 4.1 Let $\hat M_\nu^* = \{ \xi \in \hat M : \sum_{x \in E} \nu(x)\,\xi(t(x)) \le 1 \}$. $\hat M_\nu^*$ is compact in the
pointwise topology.

Proof: Let $\xi_1, \xi_2, \ldots$ be a sequence of elements in $\hat M_\nu^*$. As $[0,\infty]$ is
compact, we can always find a subsequence $\xi_{n_1}, \xi_{n_2}, \ldots$ so that for any $s \in M$,

\[ \xi_{n_i}(s) \to \xi(s), \qquad (4.5) \]

where $0 \le \xi(s) \le \infty$.

We have to show that this limit $\xi$ in fact is an element of $\hat M_\nu^*$.
From Fatou's lemma, we get

\[ \sum_{x \in E} \nu(x)\,\xi(t(x)) \le \liminf_{i \to \infty} \sum_{x \in E} \nu(x)\,\xi_{n_i}(t(x)) \le 1. \qquad (4.6) \]

We shall now just prove that $\xi(s) < +\infty$ and that

\[ \xi(s * t) = \xi(s)\,\xi(t). \qquad (4.7) \]

But as $t(E) = M_1$ and $\nu(x) > 0$ for all $x \in E$, (4.6) gives that $\xi(s) < +\infty$
for all $s \in M_1$. Now, if $s \in M_n = M_1 * \cdots * M_1$, where $M_1$ appears $n$ times,
we have

\[ \xi(s) = \lim_{i \to \infty} \xi_{n_i}(s) = \lim_{i \to \infty} \big( \xi_{n_i}(s_1) \cdots \xi_{n_i}(s_n) \big) = \xi(s_1) \cdots \xi(s_n), \qquad (4.8) \]

where $s_1,\ldots,s_n \in M_1$ and $s_1 * \cdots * s_n = s$. This gives that $\xi(s) < \infty$ for all
$s \in M$ since

\[ \xi(e) = \lim_{i \to \infty} \xi_{n_i}(e) = 1, \qquad (4.9) \]

and also that $\xi \in \hat M$. The lemma is proved.

We can now show the existence of the maximum likelihood estimate for any
$s \in M$.

Proposition 4.1: For all $s \in M$, there is a $\hat\xi \in \hat M_\nu$, so that $\hat\xi(s) = \sup_{\xi \in \hat M_\nu} \xi(s)$.

Proof:
As $\hat M_\nu^*$ is compact and the mapping $\xi \mapsto \xi(s)$ is continuous, there is a $\xi^* \in \hat M_\nu^*$
so that

\[ \xi^*(s) = \sup_{\xi \in \hat M_\nu^*} \xi(s). \qquad (4.10) \]

But if

\[ \sum_{x \in E} \nu(x)\,\xi^*(t(x)) = c < 1, \qquad (4.11) \]

then $\tilde\xi: M \to R_+$ defined as

\[ \tilde\xi(s) = \xi^*(s)\, \frac{1}{c^n} \quad \text{for } s \in M_n \qquad (4.12) \]

is in $\hat M_\nu$ and $\tilde\xi(s) > \xi^*(s)$, which is a contradiction. Hence we must have
$c = 1$, $\hat\xi(s) = \xi^*(s)$, $\hat\xi \in \hat M_\nu$ and

\[ \hat\xi(s) = \sup_{\xi \in \hat M_\nu} \xi(s), \qquad (4.13) \]

which was to be proved.
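For the Bernoulli model of Example 3.1 the existence statement can be illustrated by a direct search. In the sketch below the parameter space $\hat M_\nu$ is identified with $\theta \in [0,1]$ through $\xi_{\theta,1-\theta}$, and the maximization is carried out over a grid of step $1/100$; the grid and the two observed values of the statistic are assumptions made for the illustration.

```python
from fractions import Fraction

def xi_at(theta, t0):
    """Value xi_{theta, 1-theta}(t0) for t0 = (number of ones, number of zeros)."""
    ones, zeros = t0
    return theta ** ones * (1 - theta) ** zeros

grid = [Fraction(i, 100) for i in range(101)]   # includes theta = 0 and theta = 1

def mle_on_grid(t0):
    """Grid point maximizing xi(t0); a stand-in for the supremum in (4.4)."""
    return max(grid, key=lambda th: xi_at(th, t0))

interior = mle_on_grid((3, 1))   # three ones, one zero
boundary = mle_on_grid((0, 4))   # four zeros, no ones
print(interior, boundary)
```

The second case is the one the classical family cannot handle: the maximum is attained at $\theta = 0$, which belongs to the general exponential model but not to the classical Bernoulli family.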

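Next we prove the uniqueness of the maximum likelihood estimate.

Proposition 4.2: If $\xi_1(s_0) = \xi_2(s_0) = \sup_{\xi \in \hat M_\nu} \xi(s_0)$, then $\xi_1 = \xi_2$.

Proof: For $s \in M$ let

\[ \xi(s) = \sqrt{\xi_1(s)\,\xi_2(s)}. \qquad (4.14) \]

Define $\tilde\xi: M \to R_+$ by

\[ \tilde\xi(s) = \frac{\xi(s)}{\Big( \sum_{x \in E} \nu(x)\,\xi(t(x)) \Big)^{k}} \quad \text{for } s \in M_k. \qquad (4.15) \]

Obviously $\tilde\xi \in \hat M_\nu$. If $\xi_1 = \xi_2$ for all $s \in M_1$, then $\xi_1 = \xi_2$ for all $s \in M$.
The Cauchy-Schwarz inequality gives

\[ \sum_{x \in E} \nu(x)\,\xi(t(x)) \le \Big( \sum_{x \in E} \nu(x)\,\xi_1(t(x)) \Big)^{1/2} \Big( \sum_{x \in E} \nu(x)\,\xi_2(t(x)) \Big)^{1/2} = 1, \qquad (4.16) \]

as $\xi_1$ and $\xi_2$ are in $\hat M_\nu$. We therefore have

\[ \xi_1 \ne \xi_2 \Rightarrow \sum_{x \in E} \nu(x)\,\xi(t(x)) < 1. \qquad (4.17) \]

But then

\[ \tilde\xi(s_0) > \sqrt{\xi_1(s_0)\,\xi_2(s_0)} = \xi_1(s_0) = \xi_2(s_0), \qquad (4.18) \]

which is a contradiction. Hence $\xi_1 = \xi_2 = \hat\xi$, which was to be proved.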

The  next  result  giving  some  more  detailed  information  about  the  maximum 
likelihood  estimate  should  be  compared  to  the  results  in  section  7,  part  I 
of  Barndorff-Nielsen  (1973). 

Proposition 4.3: The positivity region of $\hat\xi$, where $\hat\xi(s_0) = \sup_{\xi \in \hat M_\nu} \xi(s_0)$, is
exactly the face $F(s_0)$.

Proof: As $\hat\xi(s_0) > 0$, we have from proposition 2.2 that

\[ M_+(\hat\xi) = \{ s \in M: \hat\xi(s) > 0 \} \supseteq F(s_0). \qquad (4.19) \]

Suppose that $M_+(\hat\xi) \ne F(s_0)$, i.e., there is an $s'$ in $M_+(\hat\xi) \setminus F(s_0)$. Then
$s' \in M_n$ for some $n$ and $s' = s_1 * s_2 * \cdots * s_n$. As $M_+(\hat\xi)$ is a face, we then
have

\[ s_1,\ldots,s_n \in M_+(\hat\xi) \cap M_1. \qquad (4.20) \]

At least one of them, say $s_1$, must be outside $F(s_0)$ since $F(s_0)$ is a
submonoid and the sum is outside $F(s_0)$. Now let

\[ \xi^*(s) = \begin{cases} \hat\xi(s) & \text{for } s \in F(s_0) \\ 0 & \text{otherwise} \end{cases} \qquad (4.21) \]

and define $\tilde\xi: M \to R_+$ by

\[ \tilde\xi(s) = \frac{\xi^*(s)}{\Big( \sum_{x \in E} \nu(x)\,\xi^*(t(x)) \Big)^{n}} \quad \text{for } s \in M_n. \qquad (4.22) \]

Clearly $\tilde\xi \in \hat M_\nu$ and

\[ \sum_{x \in E} \nu(x)\,\xi^*(t(x)) < 1. \qquad (4.23) \]

Therefore $\tilde\xi(s_0) > \hat\xi(s_0)$, which is a contradiction. Hence we must have

\[ M_+(\hat\xi) = F(s_0), \qquad (4.24) \]

which was to be proved.

So, the support of the estimated measure is closely tied to the way the
observations can occur. If $s_0$ is observed after $n$ experiments, $t(x_i)$ must
be in the smallest face containing $s_0$ for all $i = 1,\ldots,n$ as

\[ s_0 = t(x_1) * \cdots * t(x_n). \qquad (4.25) \]

The estimate contains this information as the support of $P_{\hat\xi}$ is reduced to the
subset of $E$ where $t(x) \in F(s_0)$.
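The uniform model of Example 3.3 gives a concrete instance of this remark. The Python sketch below maximizes $\xi_{\theta,1/\theta}(t_0)$ over a finite range of $\theta$ (the upper bound 50 and the sample are assumptions of the sketch) and then lists the support of the estimated measure.

```python
def t_n(xs):
    """Sufficient statistic in (E, max) x (N, +): sample maximum and sample size."""
    return (max(xs), len(xs))

def xi_at(theta, t0):
    """Value of xi_{theta, 1/theta} at t0 = (m, n): theta**(-n) if m <= theta, else 0."""
    m, n = t0
    return theta ** (-n) if m <= theta else 0.0

sample = [3, 1, 4, 1, 5]                 # illustrative observations
t0 = t_n(sample)

# Search over theta = 1, ..., 50 (finite search range is an assumption).
theta_hat = max(range(1, 51), key=lambda th: xi_at(th, t0))

# Support of the estimated measure: points x with positive probability.
support = [x for x in range(1, 51) if xi_at(theta_hat, (x, 1)) > 0]
print(theta_hat, support)
```

The estimate equals the sample maximum, and the estimated support $\{1,\ldots,5\}$ consists exactly of the points $x$ with $t(x)$ in the smallest face containing $t_0$.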

5. Extreme Families of Random Walks on Monoids

First  we  introduce  the  definition  of  an  extreme  family  of  Markov  chains  as 
given  by  Lauritzen  (1974). 

Let $(E_n,\ n = 1,2,\ldots)$ be a family of discrete, at most denumerable spaces
and $Q = (Q_{mn})_{m<n}$ a family of matrices with elements $q_{mn}(x,y)$, $x \in E_m$, $y \in E_n$,
satisfying

\[ q_{mn}(x,y) \ge 0, \qquad \sum_{x \in E_m} q_{mn}(x,y) = 1 \]

and

\[ Q_{mn} Q_{np} = Q_{mp} \quad \text{for } m < n < p. \qquad (5.1) \]

$\mathcal M(Q)$ denotes the set of sequences of probability measures $\mu = (\mu_n,\ n = 1,2,\ldots)$
so that $\mu_n$ is a probability measure on $E_n$ and

\[ \mu_m = Q_{mn}\,\mu_n \quad \text{for all } m \le n. \qquad (5.2) \]

$\mathcal M(Q)$ is a convex set and $\mathcal E(Q)$ shall mean the extreme points of $\mathcal M(Q)$. The
family of Markov chains on $\prod_{n=1}^{\infty} E_n$ defined by the initial distributions

\[ P_\mu\{X_1 = x\} = \mu_1(x) \qquad (5.3) \]

and the transition probabilities for $m \le n$

\[ P_\mu\{X_n = y \mid X_m = x\} = \begin{cases} q_{mn}(x,y)\, \dfrac{\mu_n(y)}{\mu_m(x)} & \text{for } \mu_m(x) \ne 0 \\[6pt] \mu_n(y) & \text{otherwise,} \end{cases} \qquad (5.4) \]

where $\mu$ takes all values in $\mathcal E(Q)$, is called the extreme family generated by
$Q$.

For all $\mu \in \mathcal M(Q)$, the matrices $Q_{mn}$ define the "backward conditional
probabilities", i.e.

\[ P_\mu\{X_m = x \mid X_n = y\} = q_{mn}(x,y) \quad \text{for } m \le n, \qquad (5.5) \]

and $\mu_n$ the marginal distributions of $X_n$, i.e.

\[ P_\mu\{X_n = x\} = \mu_n(x). \qquad (5.6) \]

Now consider the sequence of spaces $M_n$, $n = 1,2,\ldots$, where $M = \bigcup_{n=0}^{\infty} M_n$,
corresponding to a general exponential model. Let

\[ Y_n = t(X_1) * \cdots * t(X_n), \qquad (5.7) \]

where $X_1,\ldots,X_n$ are independent and identically distributed as

\[ P_\xi\{X = x\} = \nu(x)\,\xi(t(x)), \qquad (5.8) \]

where $\xi \in \hat M_\nu$ is unknown. Let

\[ \alpha(s) = \sum_{x \in E:\ t(x) = s} \nu(x). \qquad (5.9) \]

We have $\alpha(s) > 0$ for all $s \in M_1$. Define the $n$'th convolution $\alpha^{*n}$ of
$\alpha$ as $\alpha^{*1}(s) = \alpha(s)$ and

\[ \alpha^{*n}(s) = \sum_{a*b=s} \alpha(a)\,\alpha^{*(n-1)}(b) \quad \text{for } n = 2,3,\ldots \qquad (5.10) \]

We have $\alpha^{*n}(s) > 0$ for all $s \in M_n$.
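The convolution (5.10) is straightforward to compute for a finite $M_1$. The following Python sketch represents $\alpha$ as a dictionary from monoid elements to weights and evaluates $\alpha^{*n}$ for the Bernoulli model of Example 3.1, where $\alpha((1,0)) = \alpha((0,1)) = 1$; the helper names are assumptions of the sketch.

```python
from collections import defaultdict
from math import comb

def convolve(alpha, beta, op):
    """Sum alpha(a) * beta(b) over all pairs (a, b), grouped by a * b."""
    out = defaultdict(int)
    for a, wa in alpha.items():
        for b, wb in beta.items():
            out[op(a, b)] += wa * wb
    return dict(out)

def conv_power(alpha, n, op):
    """n-th convolution of alpha with itself, following the recursion (5.10)."""
    result = dict(alpha)
    for _ in range(n - 1):
        result = convolve(alpha, result, op)
    return result

def add2(p, q):
    """Composition in (N,+) x (N,+)."""
    return (p[0] + q[0], p[1] + q[1])

alpha = {(1, 0): 1, (0, 1): 1}       # Bernoulli model: nu(0) = nu(1) = 1
alpha4 = conv_power(alpha, 4, add2)
print(alpha4)
```

In this case $\alpha^{*n}$ counts the samples of size $n$ leading to a given value of the statistic, so $\alpha^{*4}((k, 4-k))$ is the binomial coefficient $\binom{4}{k}$.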

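$Y_1, Y_2, \ldots$ forms a Markov chain on $\prod_{n=1}^{\infty} M_n$ and we have for $m < n$ and
$P_\xi\{Y_n = y\} > 0$

\[ P_\xi\{Y_m = x \mid Y_n = y\} = \frac{P_\xi\{Y_n = y \mid Y_m = x\}\, P_\xi\{Y_m = x\}}{P_\xi\{Y_n = y\}} \]
\[ = \frac{\Big( \sum_{a:\ a*x=y} \alpha^{*(n-m)}(a)\,\xi(a) \Big)\, \alpha^{*m}(x)\,\xi(x)}{\alpha^{*n}(y)\,\xi(y)} = \frac{\alpha^{*m}(x)}{\alpha^{*n}(y)} \sum_{a:\ a*x=y} \alpha^{*(n-m)}(a). \qquad (5.11) \]

We shall now consider the system of backward conditional distributions
$(Q_{mn})_{m<n} = Q$ with elements $q_{mn}(x,y)$, $x \in M_m$, $y \in M_n$, given by

\[ q_{mn}(x,y) = \frac{\alpha^{*m}(x)}{\alpha^{*n}(y)} \sum_{a:\ a*x=y} \alpha^{*(n-m)}(a). \qquad (5.12) \]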
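For the Bernoulli model the backward probabilities (5.12) can be written down explicitly, which makes it visible that they do not involve $\xi$. The sketch below assumes $\alpha^{*n}((x,y)) = \binom{n}{x}$ for $x + y = n$ in that model, and uses the fact that in $(N,+) \times (N,+)$ the element $a$ with $a * x = y$ is unique when it exists, so the inner sum has at most one term.

```python
from fractions import Fraction
from math import comb

def alpha_n(n, s):
    """n-th convolution of alpha in the Bernoulli model (0 outside M_n)."""
    x, y = s
    return comb(n, x) if x >= 0 and y >= 0 and x + y == n else 0

def q(m, n, x, y):
    """Backward probability (5.12) of Y_m = x given Y_n = y, for m < n."""
    a = (y[0] - x[0], y[1] - x[1])          # the only candidate with a * x = y
    return Fraction(alpha_n(m, x) * alpha_n(n - m, a), alpha_n(n, y))

m, n = 2, 5
y = (3, 2)                                   # three ones among five trials
column = {(i, m - i): q(m, n, (i, m - i), y) for i in range(m + 1)}
print(column)
```

Each column of $Q_{mn}$ sums to one, as required in section 5, and here it is the hypergeometric distribution of the number of ones among the first $m$ trials given the total among $n$ trials.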


We shall find $\mathcal E(Q)$ and in fact show that

\[ \mu \in \mathcal E(Q) \iff \exists\, \xi \in \hat M_\nu:\ \mu_n(x) = \alpha^{*n}(x)\,\xi(x), \qquad (5.13) \]

i.e. exactly the family of Markov chains made up by sequences of sufficient
statistics from successive repetitions of experiments giving rise to random
variables following a general exponential model.

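First we need a lemma. For $\mu = (\mu_n,\ n = 1,2,\ldots) \in \mathcal M(Q)$, $x \in M_k$, $k = 1,2,\ldots$
define the sequence $T_{x,k}\mu$ by

\[ T_{x,k}\mu_n(a) = \begin{cases} \dfrac{\mu_{n+k}(a*x)}{\mu_k(x)} \cdot \dfrac{\alpha^{*k}(x)\,\alpha^{*n}(a)}{\alpha^{*(n+k)}(a*x)} & \text{if } \mu_k(x) > 0 \\[8pt] \mu_n(a) & \text{otherwise.} \end{cases} \qquad (5.14) \]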

Lemma  *+.  1  y  £7Tl(Q)  =>  T  ,  ufJll(Q)  . 

‘  1  1  _I  ~ T—  Xj  K. 

Proof: 

Clearly,  if  y^x)  =  0,  Tx  k  V1  =  h  and  hence  Tx  ycm(Q). 
If  y^x)  ^  0,  we  have 


2 
a  e  M 


x,k 


Pn(a) 


VX' 


2  2 
b€-M  t.  a :  a*x=b 
n+k 


q*k(x)  a*n(a) 
a*(n+k)(b) 


y 


n+k 


(b) 


\<x) 


2  Vn+k(x-b>  ‘Wb)  ■ 


b  6M 


n+k 


(5.15) 


as  y  was  known  to  be  in  ?Tl(  Q ) . 
Further,  we  get 

2  V(a-b)  L.k  Vb> 


b  €M 


n 


■22s 


-(a)  a*(n~m)  (c)  Vk(b,x)  e.*k(x) 


b€M  c:c#a=b 
n 


»n/.  \ 
q  (b) 


Vx)  q*(n+k)(b«x) 
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*m  /  \  *k  /  \ 
a  (a)  a  (x)  1 


g*(m+k)(a*x)  \{x) 


I  I 

b  €M  c:c#a=b 
n 


g*(m4k>(a»x)  c.*(n-m) 

a“(n+k){b,x) 


M 


Vk(b,x) 


(5.16) 


Wow 


{c:c*a  =  b}  <2(c:c*a*x  =  b*x) 


(5.17) 


and 


{b*x:  x6M  }  c? M  ,  , 
n  —  n+k 


(5.18) 


so  we  have  the  inequality 


Z,  q^ts.b)  t  k  yn(b) 

b  €  M  * 

n 


g*lllrik)(a»x)  n*(n-m)(c) 


.  a*m(a)  g»k(x)  1  Y  Y  _ 

"g*(lntk)<a,x)  Vx)  2a  g*(n+k)(d) 

d  tMn+k  c: c*(a*x)=d  v ' 


yn+k^ 


*Di/  x  *k  /  \ 

a  (a)  a  (x)  l 


g*(m+k)(a,x)  \(x) 


I 


1  .i  . -i  (a*x,d)  u  „  (d) 
m+k,  n+k  *  n+k 


d  €  M 


n+k 


Tx,k  bm(a)  ■ 


(5.19) 


or in short,

$$
T_{x,k}\mu_m(a) \ \ge \ \sum_{b \in M_n} q_{m,n}(a, b)\, T_{x,k}\mu_n(b).
\tag{5.20}
$$

But by (5.15), both sides of (5.20) add up to one when summing over $a \in M_m$, and hence we must have equality and therefore $T_{x,k}\mu \in \mathcal{M}(Q)$, which was to be proved.
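
The lemma can be checked numerically in a small case. The following is a minimal sketch under stated assumptions, not part of the original argument: it takes the Bernoulli model with $E = \{0, 1\}$, counting measure and $t(x) = (x, 1)$, so that $\alpha^{*n}((s, n)) = \binom{n}{s}$, and it assumes the backward transition probabilities have the form $q_{m,n}(a, b) = \alpha^{*m}(a) \sum_{c:\, c * a = b} \alpha^{*(n-m)}(c) / \alpha^{*n}(b)$. The function names and the two-point mixture are hypothetical choices made for illustration.

```python
from fractions import Fraction
from math import comb

# Hypothetical two-point mixing distribution over the Bernoulli parameter.
THETAS = (Fraction(1, 4), Fraction(3, 4))
WEIGHTS = (Fraction(1, 2), Fraction(1, 2))

def alpha(n, s):
    """alpha^{*n}((s, n)): number of 0/1 sequences of length n with sum s."""
    return comb(n, s) if 0 <= s <= n else 0

def mu(n, s):
    """A mixture of binomials: a non-extreme element of M(Q) in this example."""
    return sum(w * alpha(n, s) * t**s * (1 - t)**(n - s)
               for w, t in zip(WEIGHTS, THETAS))

def q(m, a, n, b):
    """Assumed backward transition q_{m,n}((a, m), (b, n)) (hypergeometric here)."""
    return Fraction(alpha(m, a) * alpha(n - m, b - a), alpha(n, b))

def T(x, k, n, a):
    """T_{x,k} mu_n(a) for x = (x, k) in M_k and a = (a, n) in M_n."""
    if mu(k, x) == 0:
        return mu(n, a)
    ratio = Fraction(alpha(k, x) * alpha(n, a), alpha(n + k, a + x))
    return mu(n + k, a + x) / mu(k, x) * ratio

def transformed_is_consistent(x, k, m, n):
    """Check T mu_m(a) = sum_b q_{m,n}(a, b) T mu_n(b) for every a in M_m."""
    return all(
        T(x, k, m, a) == sum(q(m, a, n, b) * T(x, k, n, b) for b in range(n + 1))
        for a in range(m + 1)
    )

print(transformed_is_consistent(1, 2, 2, 4))
```

In this example $T_{x,k}\mu$ is again a mixture of binomials, with the mixing weights updated by the likelihood of $x$, which is why the consistency check succeeds in exact rational arithmetic.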
Proposition 4.1

$\mu \in \mathcal{E}(Q) \iff \exists\, \xi \in \hat{M}_\nu : \ \mu_n(x) = \alpha^{*n}(x)\, \xi(x)$.

Or, in words, the extreme family generated by $Q$ consists of "random walks",

$$
Y_n = t(X_1) * \cdots * t(X_n),
\tag{5.21}
$$

where the $t(X_i)$'s are independent and identically distributed with the distribution of $Y_1 = t(X_1)$ given by

$$
\mu_1(x) = \alpha(x)\, \xi(x) \quad \text{for some } \xi.
\tag{5.22}
$$
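
As an illustration of the proposition, the sketch below checks in exact arithmetic that a sequence of the form $\mu_n = \alpha^{*n}\xi$ satisfies the consistency equation defining $\mathcal{M}(Q)$. It is a sketch under assumptions rather than anything taken from the paper: the model is the Bernoulli one ($E = \{0,1\}$, counting measure, $t(x) = (x, 1)$), the homomorphism is $\xi((s, n)) = \theta^s (1-\theta)^{n-s}$, and $q_{m,n}$ is assumed to be $\alpha^{*m}(a) \sum_{c:\, c * a = b} \alpha^{*(n-m)}(c) / \alpha^{*n}(b)$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def alpha(n, s):
    """alpha^{*n}((s, n)): number of 0/1 sequences of length n with sum s."""
    return comb(n, s) if 0 <= s <= n else 0

def xi(theta, n, s):
    """A homomorphism on M = {(s, n)}: xi(u * v) = xi(u) * xi(v)."""
    return theta**s * (1 - theta)**(n - s)

def mu(theta, n, s):
    """mu_n = alpha^{*n} * xi, the form asserted for extreme points."""
    return alpha(n, s) * xi(theta, n, s)

def q(m, a, n, b):
    """Assumed backward transition q_{m,n}((a, m), (b, n))."""
    return Fraction(alpha(m, a) * alpha(n - m, b - a), alpha(n, b))

def is_consistent(theta, m, n):
    """Check mu_m(a) = sum_b q_{m,n}(a, b) mu_n(b) for every a in M_m."""
    return all(
        mu(theta, m, a) == sum(q(m, a, n, b) * mu(theta, n, b) for b in range(n + 1))
        for a in range(m + 1)
    )

theta = Fraction(1, 3)
print(is_consistent(theta, 2, 5))
```

Here $\mu_n$ is the binomial distribution of a sum of $n$ independent Bernoulli variables, which is the random walk description in (5.21) and (5.22) for this particular model.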

Proof: The proof consists of the following steps. First we use Lemma 4.1 to obtain a representation of any $\mu \in \mathcal{M}(Q)$ as a convex combination of other elements ($T_{x,k}\mu$) in $\mathcal{M}(Q)$. If $\mu$ then is extreme, $\mu$ must be equal to these other elements, which gives us an equation. This equation is essentially the homomorphism equation and we can then establish "$\Rightarrow$". To prove "$\Leftarrow$" we show that a proper mixture of homomorphisms cannot be a homomorphism.

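Suppose now that $\mu \in \mathcal{M}(Q)$ is extreme, i.e. $\mu \in \mathcal{E}(Q)$. We note that the equation for $\mu \in \mathcal{M}(Q)$,

$$
\mu_n(a) = \sum_{b \in M_{n+k}} \ \sum_{c:\, c * a = b} \frac{\alpha^{*n}(a)\,\alpha^{*k}(c)}{\alpha^{*(n+k)}(a * c)} \, \mu_{n+k}(a * c),
\tag{5.23}
$$

implies that

$$
\mu_n(a) = 0 \implies \mu_{n+k}(a * c) = 0,
\tag{5.24}
$$

as $q_{n,n+k}(a, b) > 0$ for all $b \in M_{n+k}$. Hence (5.23) can be rearranged to

$$
\mu_n(a) = \sum_{b \in M_{n+k}} \ \sum_{c:\, c * a = b} \mu_k(c) \cdot T_{c,k}\mu_n(a).
\tag{5.25}
$$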
This gives $\mu$ as a convex combination of $T_{c,k}\mu$, $c \in M_k$, for all $k = 1, 2, \ldots$, and as $\mu$ was supposed to be extreme,

$$
\mu_n(a) = T_{c,k}\mu_n(a) \quad \text{for all } k = 1, 2, \ldots \text{ and } c \in M_k.
\tag{5.26}
$$

Thus, for all $c \in M_k$, $k = 1, 2, \ldots$ so that $\mu_k(c) > 0$, we must have

$$
\frac{\mu_{n+k}(a * c)}{\alpha^{*(n+k)}(a * c)} = \frac{\mu_n(a)}{\alpha^{*n}(a)} \cdot \frac{\mu_k(c)}{\alpha^{*k}(c)}.
\tag{5.27}
$$


If we let

$$
h_n(a) = \frac{\mu_n(a)}{\alpha^{*n}(a)},
\tag{5.28}
$$

(5.27) becomes

$$
h_{n+k}(a * c) = h_n(a)\, h_k(c).
\tag{5.29}
$$


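But as

$$
\mu_k(c) = 0 \implies \mu_{n+k}(a * c) = 0,
$$

(5.29) must hold for all $n$, $k$, $a \in M_n$, $c \in M_k$. As

$$
M_m \cap M_n = \emptyset \quad \text{for } m \neq n,
\tag{5.30}
$$

we can define a mapping $\xi$ from $M$ to $\mathbb{R}_+$ by

$$
\xi(a) = h_n(a) \quad \text{for } a \in M_n,
\tag{5.31}
$$

and by (5.29) $\xi \in \hat{M}$.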

If $\mu$ is extreme, we then have

$$
\frac{\mu_n(a)}{\alpha^{*n}(a)} = \xi(a) \iff \mu_n(a) = \alpha^{*n}(a)\, \xi(a).
\tag{5.32}
$$
As $\mu_1$ is a probability, we have

$$
\sum_a \mu_1(a) = \sum_a \alpha(a)\, \xi(a) = 1,
\tag{5.33}
$$

i.e., that $\xi \in \hat{M}_\nu$. We have proved "$\Rightarrow$".

Now suppose that

$$
\mu^{0}_n(x) = \alpha^{*n}(x)\, \xi_0(x) \quad \text{for some } \xi_0 \in \hat{M}_\nu.
\tag{5.34}
$$

All elements in $\mathcal{M}(Q)$ are mixtures of the elements in $\mathcal{E}(Q)$. It follows from what we proved before that the set of sequences

$$
\{\mu^{\xi},\ \xi \in \hat{M}_\nu\},
\tag{5.35}
$$

where

$$
\mu^{\xi}_n(x) = \alpha^{*n}(x)\, \xi(x),
\tag{5.36}
$$

contains $\mathcal{E}(Q)$. A fortiori any $\mu \in \mathcal{M}(Q)$ can be represented as a mixture of elements of the form (5.36). This is in particular true for $\mu^{0} = \mu^{\xi_0}$. Hence, there is a probability measure $P$ on $\hat{M}_\nu$ so that for all $x \in M$

$$
\alpha^{*n}(x)\, \xi_0(x) = \int_{\xi \in \hat{M}_\nu} \alpha^{*n}(x)\, \xi(x)\, dP(\xi).
\tag{5.37}
$$

But (5.37) is equivalent to

$$
\xi_0(x) = \int \xi(x)\, dP(\xi) \quad \text{for all } x \in M.
\tag{5.38}
$$

Using the homomorphism property, we have

$$
\left( \int_{\xi \in \hat{M}_\nu} \xi(x)\, dP(\xi) \right)^{2} = (\xi_0(x))^2 = \xi_0(x * x) = \int \xi(x * x)\, dP(\xi) = \int (\xi(x))^2\, dP(\xi).
\tag{5.39}
$$

But (5.39) implies that $P\{\xi_0\} = 1$ and hence that $\xi_0$ is extreme. The proof is complete.
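
The last step says that a proper mixture of homomorphisms cannot itself be a homomorphism, because the two ends of (5.39) differ by the variance of $\xi(x)$ under $P$. The following minimal sketch illustrates this for the Bernoulli homomorphisms $\xi_\theta((s, n)) = \theta^s (1-\theta)^{n-s}$; the model, the two-point mixing measure and the function names are assumptions chosen for illustration only.

```python
from fractions import Fraction

def xi(theta, s, n):
    """Bernoulli homomorphism on M = {(s, n)} with composition (s, n) * (s', n') = (s + s', n + n')."""
    return theta**s * (1 - theta)**(n - s)

def mixture(thetas, weights, s, n):
    """Integral of xi(x) dP(xi) for a finitely supported P, at x = (s, n)."""
    return sum(w * xi(t, s, n) for t, w in zip(thetas, weights))

def jensen_gap(thetas, weights, s, n):
    """Integral of xi(x)^2 dP minus (integral of xi(x) dP)^2, using xi(x)^2 = xi(x * x)."""
    return mixture(thetas, weights, 2 * s, 2 * n) - mixture(thetas, weights, s, n) ** 2

thetas = (Fraction(1, 4), Fraction(3, 4))
proper = (Fraction(1, 2), Fraction(1, 2))    # proper mixture
degenerate = (Fraction(1), Fraction(0))      # P concentrated on one homomorphism

print(jensen_gap(thetas, proper, 1, 1))
print(jensen_gap(thetas, degenerate, 1, 1))
```

The gap is strictly positive for the proper mixture and vanishes for the degenerate one, which matches the conclusion $P\{\xi_0\} = 1$.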

6. Additional Comments

The families defined in the present paper are sometimes identical to the completion of a regular canonical exponential family as defined by Barndorff-Nielsen (1973).

Let $T$ be a finite subset of the $k$-dimensional integer lattice, let $T_0 = \{0\}$ and define $T_n$ by

$$
T_1 = T \quad \text{and} \quad T_n = T + T_{n-1} \quad \text{for } n = 2, 3, \ldots
\tag{6.1}
$$

Let $(M, *)$ be the monoid

$$
M = \{(t, n) : t \in T_n,\ n \in \mathbb{N}\},
\tag{6.2}
$$

with the composition

$$
(s, m) * (t, n) = (s + t,\ m + n).
\tag{6.3}
$$
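
The construction in (6.1) to (6.3) can be sketched in a few lines. This is only an illustration under assumptions: the particular set $T$ below and the function names are hypothetical, and lattice points are represented as integer tuples.

```python
from itertools import product

def minkowski_sum(A, B):
    """Set of coordinatewise sums a + b with a in A and b in B."""
    return {tuple(x + y for x, y in zip(a, b)) for a, b in product(A, B)}

def T_n(T, n):
    """T_0 = {0}, T_1 = T, T_n = T + T_{n-1}, following (6.1)."""
    k = len(next(iter(T)))
    current = {(0,) * k}
    for _ in range(n):
        current = minkowski_sum(T, current)
    return current

def compose(u, v):
    """Composition (6.3): (s, m) * (t, n) = (s + t, m + n)."""
    (s, m), (t, n) = u, v
    return (tuple(x + y for x, y in zip(s, t)), m + n)

# Hypothetical choice of T in the 2-dimensional integer lattice.
T = {(0, 0), (1, 0), (0, 1)}
u = ((1, 0), 1)   # element of M with first component in T_1
v = ((1, 1), 2)   # element of M with first component in T_2
w = compose(u, v)
print(w, w[0] in T_n(T, w[1]))
```

The final line checks closure in one instance: the composed element has its first component in $T_{m+n}$, as membership in $M$ requires.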

If we have a general exponential model on a space $E$ with $t(x) = (g(x), 1)$, where $g$ is a mapping from $E$ onto $T$, this model can be identified with the completion of the canonical exponential family generated by $\nu$ and $g$ in exactly the same way as in Martin-Löf (1973). This situation is present in example 3.1 of the present paper.
If $g(E) = T$ contains points other than integer lattice points, this is not necessarily the case, as the following example shows.

Example 6.1  Let $E = \{(0,0),\ (1,0),\ (0,1),\ (\sqrt{2}/4,\ 1/2)\}$. Let $M = \bigcup_{n=0}^{\infty} M_n$, where $M_0 = \{(0,0,0)\}$ and

$$
M_n = \{(x, y, n) : (x, y) \in E_n \text{ and } n \in \mathbb{N}\},
\tag{6.4}
$$

where $E_n$ is recursively defined as

$$
E_1 = E \quad \text{and} \quad E_n = E + E_{n-1} \quad \text{for } n = 2, 3, \ldots
\tag{6.5}
$$

and the composition on $(M, *)$ is defined as

$$
(x, y, n) * (x', y', n') = (x + x',\ y + y',\ n + n').
\tag{6.6}
$$

Let $\nu(x, y) = 1$ for all $(x, y)$ in $E$ and

$$
t(x, y) = (x, y, 1).
\tag{6.7}
$$

The completion of the exponential family generated by $\nu$ and $t$ would consist of all probability measures with support equal to $E$ and the probabilities degenerate in $(1,0)$, $(0,0)$ and $(0,1)$. Because $(\sqrt{2}/4,\ 1/2)$ is in the interior of the convex hull of $E$, no probability in the family would be degenerate at this point.

The subset $F$ of $M$ given by

$$
F = \{\, n\,(\sqrt{2}/4,\ 1/2,\ 1) : n \in \mathbb{N} \,\}
\tag{6.8}
$$

is obviously a face of $M$. Hence the general exponential model corresponding to $\nu$, $t$ and $M$ will contain the probability degenerate in $(\sqrt{2}/4,\ 1/2)$.
This example illustrates the essential difference between the completions defined by Barndorff-Nielsen (1973) and the models in this paper: the "completions" are defined via geometrical concepts in $\mathbb{R}^k$ or via topological considerations, whereas the general exponential models are derived via algebraic structure in the statistics, thus letting the actual observations play a more prominent role. If one after $n$ experiments in the above example obtains the value of the statistic to be $(n\sqrt{2}/4,\ n/2)$, this must be because $(\sqrt{2}/4,\ 1/2)$ was observed $n$ times. This is reflected in the estimated probability measure, which will be degenerate at $(\sqrt{2}/4,\ 1/2)$, as can be seen from proposition 4.3.
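
The claim that the statistic $(n\sqrt{2}/4,\ n/2)$ can only arise from $n$ observations of $(\sqrt{2}/4,\ 1/2)$ can be confirmed by brute force for small $n$. The sketch below is an illustration under assumptions, not part of the paper: each $x$-coordinate is stored exactly as a pair $(p, r)$ meaning $p + r\sqrt{2}$ with rational $p$ and $r$, which is unambiguous because $1$ and $\sqrt{2}$ are linearly independent over the rationals. The labels and function name are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Each point of E is stored as (p, r, y), meaning x = p + r*sqrt(2).
E = {
    "(0,0)": (Fraction(0), Fraction(0), Fraction(0)),
    "(1,0)": (Fraction(1), Fraction(0), Fraction(0)),
    "(0,1)": (Fraction(0), Fraction(0), Fraction(1)),
    "(sqrt2/4,1/2)": (Fraction(0), Fraction(1, 4), Fraction(1, 2)),
}

def samples_with_statistic(n, target):
    """All unordered samples of size n whose coordinatewise sum equals target."""
    hits = []
    for sample in combinations_with_replacement(sorted(E), n):
        total = tuple(sum(E[name][i] for name in sample) for i in range(3))
        if total == target:
            hits.append(sample)
    return hits

n = 4
target = (Fraction(0), Fraction(n, 4), Fraction(n, 2))  # (n*sqrt(2)/4, n/2)
print(samples_with_statistic(n, target))
```

The only sample returned consists of $n$ copies of the irrational point, so the conditional distribution given this value of the statistic is degenerate.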